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Abstract 

We consider a time-dependent linear diffusion equation together with a related inverse 
boundary value problem. The aim of the inverse problem is to determine, based on observations 
on the boundary, the non-homogeneous diffusion coefficient in the interior of an object. The 
method in this paper relies on solving the forward problem for a whole family of diffusivities by 
using a spectral Galerkin method in the high-dimensional parameter domain. The evaluation 
of the parametric solution and its derivatives is then completely independent of spatial and 
temporal discretizations. In case of a quadratic approximation for the parameter dependence 
and a direct solver for linear least squares problems, we show that the evaluation of the 
parametric solution does not increase the complexity of any linearized subproblem arising from 
a Gauss-Newtonian method that is used to minimize a Tikhonov functional. The feasibility of 
the proposed algorithm is demonstrated by diffusivity reconstructions in two and three spatial 
dimensions. 


1 Introduction 

Inverse boundary value problems arise in situations where one tries to find information about 
interior properties of an object by boundary measurements. In electrical impedence tomography 
(EIT), for example, an electric current is injected into the body and the corresponding voltages 
are measured across the boundary [8]. The aim is to reconstruct the electrical conductivity as 
a function, or to locate conductivity anomalies having prescribed (e.g., constant or vanishing) 
conductivity values. The inverse problem is nonlinear and ill-posed, whereas the forward prob¬ 
lem, namely, determining the boundary voltages when the conductivity and current patterns are 
given, is governed by a well-posed elliptic partial differential equation. Mathematically equivalent 
applications include electrical capacitance tomography [41]. 

In this paper, we consider the inverse boundary value problem of a time-dependent diffusion 
equation. This could be a model for thermal tomography, where the thermal diffusivity is to be 
reconstructed [6]. We assume that the heat flux at the boundary, as well as the initial temperature, 
are controlled and that the boundary temperatures are measured. The diffusivity is assumed to be 
time-independent. Unlike in the EIT problem, where the steady-state voltages are measured for 
several current patterns, in this paper we treat a single (or a few) initial and boundary conditions 
and measure the boundary temperatures at several instances of time. Similar inverse boundary 
value problems for stationary inclusion-type diffusivities have been examined in, e.g., [10, 11, 
24, 27]. In [29], both heat capacity and thermal conductivity were reconstructed simultaneously 
by using a least squares approach with several boundary conditions. In [42], an extension to 
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unknown surface flux was proposed and the feasibility of the method was verified with experimental 
three-dimensional data in [43]. The case of time-varying inclusions has been studied in [17]. For 
theoretical treatment of general inverse parabolic problems, we refer to [28]. 

Our approach to the inverse problem is based on writing the forward problem as a parametric 
differential equation. To that end, the temperature is viewed as a function depending not only 
on spatial and temporal variables, but also on parameters that define the diffusivity function. 
For the high-dimensional parameter domain, we adopt the spectral Galerkin method when the 
numerical solution is sought. The spatial domain is discretized with the finite element method. 
The combination was dubbed stochastic Galerkin finite element method in, e.g., [4], where the 
parameters were interpreted as random variables. In that context, the spectral discretization 
is often called (generalized) polynomial chaos. Since [19], both Galerkin and collocation methods 
have been thoroughly studied and analyzed for several kinds of uncertainty quantification problems, 
including the time-dependent random diffusion equation [5, 33, 35, 40, 46, 47, 48]. 

The time integration is performed by using an additive semi-implicit Euler method, a spe¬ 
cial case of so called implicit-explicit (IMEX) methods [2]. In particular, in [49] these methods 
were proposed for stiff systems resulting from hypersonic transient reactive flows. Although only 
first-order in time, in [48] the method was shown to be unconditionally stable for various discretiza¬ 
tions of parametric and stochastic diffusion equations. We demonstrate that when using locally 
supported functions to represent the diffusivity, the semi-implicit method is memory-optimal in 
the sense that storing and solving the resulting linear system requires strictly less space than the 
solution itself. 

After solving the parametric forward problem, obtaining the boundary values for different 
diffusivities is merely a task of polynomial evaluation. Indeed, in this paper we show that for a 
second-order approximation of the parameter dependence, the evaluation costs of the parametric 
solution and its Jacobian matrix do not increase the overall complexity of the inverse problem, 
if a (regularized) least squares minimization scheme based on Gauss-Newton method and QR 
decomposition is used to And the solution. These minimization schemes include the standard 
Gauss-Newton method as well as its trust region counterparts, which from computational point 
of view are very similar. Higher-order approximations increase the computational burden, but 
the workload of the inverse problem is still completely independent of the spatial and temporal 
discretizations of the forward problem. However, for large-scale nonlinear inverse problems it is 
arguably recommendable to resort to conjugate gradient iterations instead of QR decompositions 
when solving the linearized subproblems (see, e.g., [23]). The analysis of combining polynomial 
approximations and iterative linear least squares solvers is left for future studies. 

An algorithmically similar inversion approach, albeit with Bayesian paradigm, was recently 
carried out for time-independent EIT problem in [22]. To our knowledge, this is the first time 
when a parametric spectral solution to the forward problem is utilized for solving an inverse 
parabolic boundary value problem. The numerical results show that the method is capable of 
reconstructing the diffusion coefficient based on boundary measurements. Naturally, the quality 
of the reconstruction depends on the noise level of the measurement. In addition, the method is 
more suitable for smooth diffusivities. The distinctive feature of the approach is that once the 
parametric solution is saved, it takes only a few seconds to create the reconstruction from a given 
set of measurements. Indeed, the obvious advantage of the method is that spatial and temporal 
discretizations of the forward problem do not affect the complexity of reconstructing the diffusivity. 

This paper is organized as follows. In section 2, the precise mathematical formulation of the 
model problem is given. Then we re-formulate the problem in a parametric sense and discretize the 
equation in spatial, parametric and temporal dimensions. Section 3 considers the inverse problem 
from the computational point of view. Numerical examples in two and three spatial dimensions 
are provided in section 4 and some conclusions are drawn in section 5. 
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2 Parametric forward model 

2.1 Problem setting 

Let fl C d > 2, be a bounded domain with Lipschitz boundary dfl and exterior unit normal 
fi G . Furthermore, let T > 0 be given. The parabolic initial/boundary value problem 

considered in this paper is to find u: 1 ? x [0, T] —> R satisfying 

(dtU-V ■ (aVu) = / in X (0,T), 

<aVu-h = g on9l7x(0,T), (1) 

[m = Mo in 12 X {0}, 

where / € L^((0, T); L^(l7)) and g € L^((0, T); denote the interior source and the 

boundary flux, respectively, and mq € is the initial condition. The diffusivity (or diffusion 

coefficient) a belongs to L“(l7), where 

L“(17) := {a G L°°{f2) \ ess inf a > 0}. (2) 

In particular, the diffusivity is assumed to be independent of the time variable t. The variational 
formulation of the problem ( 1 ) is to find u such that 

{dtu, + (uVm, = (/, + {g, 7 m), (3) 

accompanied with an appropriate initial condition, holds for all v G H^([2) and for almost every 
t G (0,r). Here, 7 : H^{Q) —> denotes the trace operator and (•, •) is the duality pairing 

between and We refer to solving u from (1) or (3) as the forward problem. 

For the variational form (3) there exists a unique solution u satisfying 

M G L"((0,r); H\n)) n C°{[0,T];L^{f2)) (4) 

(see, e.g., [14, Chap. XVIII]). 

By inverse problem we mean finding a diffusivity a G L“(12) such that the solution u of the 
corresponding forward problem matches with given data. More precisely, we consider some given 
boundary data U: 912x(0, T) —> R and the corresponding trace U{a) := ju G L^((0, T); 
of the forward solution. It is not obvious whether the equation U (a) = U admits a (unique) so¬ 
lution a; see [28, Chap. 9] for some related results with slightly different assumptions. In any 
case, the inverse problem is ill-posed in the sense that the solution (if it exists) does not depend 
continuously on the data U in any reasonable metric. 

A physical interpretation of the problem (1) could be that u represents the temperature and 
a is the thermal diffusivity in an object 12 which has unit heat capacity. The setting is easily 
extended to a non-homogeneous but known heat capacity, in which case a denotes the thermal 
conductivity [ 6 ]. Although we restrict ourselves to the model problem (1), the presented methods 
are, with minor changes, widely applicable to other types of problems. For example, different 
boundary conditions, including Dirichlet, Robin and mixed, can be handled easily. Many of the 
observations apply to elliptic parametric problems as well. 

2.2 Parametrization of the diffusivity 

In order to treat the inverse problem U (a) = U numerically, we assume that the diffusivity is 
characterized by a finite number of parameters and an injective mapping a: 0 —> L“( 12 ), where 
0 C R^ is a high-dimensional parameter domain, is given. Besides this, our aim in this section 
is to establish an explicit parameter dependence for the solution u as well. That is, we seek a 
numerical solution to the forward problem in the form m: 12 x [0, T] x 0 —> R. The existence and 
uniqueness (similar to (4)) of such solution is obvious. 
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The most general methods for obtaining the parametric forward solution are based on solving 
the regular problem (1) or (3) for a set of collocation points in the parameter domain. The full 
parametric solution would then be written by either using the collocation points as quadrature 
nodes and projecting the solution to a chosen basis by numerical integration, or by interpolating 
the solution to the whole domain 0 by using a high-dimensional interpolation rule. In both cases, 
the number of collocation points needed is typically very high and the construction of the points 
is often based on sparse Smolyak grids [3, 36]. A related strategy is to use Galerkin method with 
double orthogonal polynomials, which also results in a decoupled system of equations [16]. 

In this paper, we discretize the parameter domain by the spectral Galerkin approach that 
yields a large coupled system. We introduce a positive weight function w € L^{0) and employ the 
corresponding weighted spaces. The variational form of the forward problem then becomes to 
find m: 17 X [0, T] X 6 > —)■ R such that for all v € L^(0; i7^(l7)) and a.e. in (0,T) 

(dtu,v)Li{0-,L^(O)) + (aVu, Vw)i2^(e.[L2(j7)]d) = + {{gjjv}), (5) 

where ((•,•)) denotes the duality pairing between the weighted spaces i7“^/^(9l7)) and 

i7^/^(9l7)). Here, the functions / and g are assumed to be constant with respect to the 
parameters. We assume that there exists a unique solution 

uGLI (0; L2((0, T); H^H)) n G°([0, T]; L^{n))). 

This is guaranteed, for example, if 0 is bounded and there exist positive constants Umin and Omax 
such that 

ainin ^ a(x, ^ Umax 

for almost every (x,i9) G f2 x 0. 

A convenient way to define the parametrized diffusivity a: 0 —> L“(l7) is to write 

p 

a(^) = X! 

p^l 


for G 0, or slightly more generally 


p 

a{'d) = a + '^'dp-4)p (7) 

p=i 

for some “background diffusivity” d G L“(f7). In this paper, we resort to (6) and choose {V'p}p=i C 
to be nonnegative and such that they form a partition of unity. Hence, the resulting 
diffusivity a is bounded by the extremal values of the parameters, i.e., 

inf min 7)^ < a(^) < sup max in 17 

i?eei<p<p ^e0i<p<P 

for all G 0. Naturally, the positivity requirement (2) imposes restrictions on choosing the 
parameter domain 0 once the functions 'ipp are fixed. 

One possible family {'4’p}p=i is a set of B-splines, which for a rectangular domain 17 are easy 
to construct as tensor products of univariate splines. Let us briefly recap the basic properties of 
B-splines, following [26]. A standard univariate uniform B-spline of degree s G N can be defined 
recursively as a convolution 


bs{x) 



bs-i{x 


y)bo{y) dy, 
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where bn{x) is the indicator function of the interval [0,1). We see that bg G C®“^(R) and supp(5s) = 
[0, s + 1]. The splines bs{x + s),..., bs{x) form a partition of unity on the interval [0,1). The 
multivariate splines can be constructed as 

d 

bs,d{x^ .— bgi^Xi) 

i=l 

and again bg^d G Transforming the standard splines to form a desired partition of unity 

is elementary. See [26] for constructing splines for non-rectangular domains. For polygonal and 
polyhedral domains one can also use piecewise linear functions such as those used in finite 
element solvers. 

It can be shown that if u and u are two forward solutions corresponding to arbitrary diffusivities 
a and a, respectively (in L“(l7), but not necessarily of the form ( 6 )), then there exists Ci > 0 
such that 

I|m - u\\l^{{0,T)-L^(Q)) < C'lllo - a||Loo(j7), 

see, e.g., [25]. On the other hand, the approximation error result for tensor product B-splines 
{tpp}p^i states that if S contains all the functions of the form ( 6 ) with 0 = R^, then 

inf [jo - a||ioo(^ 2 ) < C' 2 K®+^ rnax \\d^+^a\\L^(^o) 

(Xto L^z^ci 

for any diffusivity a G L°°{Q) and spline degree s > 0, where C 2 {s) > 0 and n ~ is a 

characteristic distance between spline knots [13]. 

Substituting ( 6 ) into (5) leads to 


p 

{dtu, v) + '^{ipippS/u, Vu) = (/, u) + {{g, jv)), ( 8 ) 

p=i 

where we have dropped the subscripts for brevity, and where the projection 6 p: 0 —> R is defined 
by = "dp. In the following, we discretize the spaces T^(0) and H^{n) in order to recast ( 8 ) 

as a matrix equation, which is then solved by performing a finite difference discretization of the 
temporal domain. 

For a separable Hilbert space H, the isomorphism H) ~ L^{0) ® H holds, if T^(0) is 

also separable (see, e.g., [40, Sec. B.3] and references therein). Now if {(j)i} and {g^j} are countable 
bases for and L^(0), respectively, then {4>i^Pj} forms a basis for L^(0; iJ^(f2)). This 

suggests looking for a numerical solution to the parametric forward ( 8 ) problem in the form 

M N 

UM,N{x,t,'d) = 

^=l j=l 

(where the finite-dimensional bases are not necessarily subsets of the aforementioned countable 
bases) with the time-independent test functions v G T^(0; iF^(l7)) having the form Vij = 

The semi-discrete equation for the parametric forward problem ( 8 ) can now be written as 

dtBu{t) + Au(t) = r{t), (9) 

where u : (0,T) —>■ R^^ denotes the vector of unknown coefficients 

u{t) := [ui^i{t),U2,l{t), . . . . . . ,UM-l,Nit),UM,N{t)]'^ , (10) 

B G rmnxmn is the parametric mass matrix defined by its entries 


(11) 
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and A € parametric stiffness matrix satisfying 

p 

^i,j,k,l = '^{l'pV’j,‘Pl)Ll(0)ii’p'^<Pi,'^4>k)L^(a)- ( 12 ) 

P=1 

The vector f: (0,T) —> on the right-hand side is 

fk,l{t) = {l,(pi)Ll{ 0 ){{f{t),(l)k)L^Q) + {g{t),(j)k))- (13) 


2.3 Finite element discretization 

We construct a finite-dimensional subspace Vh C H^{Q) by following the standard finite element 
method (FEM) procedures [32]. The finite element basis functions are denoted by {(t>i]iLi and 
h > 0 is the mesh size parameter of a regular enough mesh. The contribution of the spatial parts 
in (11) can be thought of as the symmetric positive-definite mass matrix B* G defined by 

Bi^k = {(t>iAk)L‘^{n)- 


For a non-homogeneous heat capacity, the elements of the mass matrix should be modified accord¬ 
ingly. 

The spatial term in (12) involves symmetric positive-semidefinite matrices of the stiffness type 




Note that the individual matrices may be very sparse and rank-deficient if the functions ipp 
are locally supported. Let us denote by nnz(F) the number of nonzero elements of an arbitrary 
matrix V and introduce a sparsity quantity 


Ep=innz(24(p)) 

miz(A*) 


(14) 


where A* G is the standard stiffness matrix defined by A*/. = {W(j>i,W(f>k)- Clearly, 

1 < rj < P. Note also that 

'^A^P^ = A\ 

p=i 

due to the partition of unity property of {V’p}^=i- 

We do not consider the convergence with respect to h in detail. However, recall that the familiar 
a priori error estimate 

\\u — Uh\\m{r2) < Ch'' ^||u||ppr(^2) 

for the numerical solution Uh of a canonical elliptic problem depends on r > 1, for which 


r < supjq I u G iL*(i7)}, 

9eR 

and for which the employed finite element type also sets an upper bound. Since for a noncontinuous 
coefficient function (corresponding to zeroth order splines and s = 0), the solution u is not in 
the convergence becomes sublinear and thus deteriorates even for piecewise linear FEM 
basis functions. Similarly, higher order FEM basis functions should be used only if the spline degree 
is high enough. In practice, however, the cardinality of the finite element space is much larger than 
the number of splines. Thus, the solution may converge rapidly regardless of the nonsmoothness of 
the diffusivity. The compatibility of the diffusivity representation and finite element discretization, 
in terms of forward computations, is discussed in [34]. 
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2.4 Spectral Galerkin method 

Next, we discretize the space by introducing an A^-dimensional subspace W C L^(6>). 

We adopt the spectral Galerkin method, and therefore choose basis functions that are 

orthogonal in the L^-sense. For convenience, we also assume normality, thus 

= S],l, 

where 5 is the Kronecker delta. It immediately follows that the matrix corresponding to the 
parametric part of (11) is the identity matrix I G therefore B = I ® B*. 

We only consider hyperrectangular parameter domains, and for simplicity assume that the 
domain is a hypercube 0 = for some interval if C R. The P-variate basis functions can then 
be easily constructed as tensor products of univariate polynomials {<^r}r=o> where n G Nq is the 
maximum univariate degree. More precisely, we define 

p 

P^l 

for 7 = 1,..., A^. In addition, the weight function is assumed to be separable in the sense that 

p 

p=i 

(Often we may even have Wp = w for all p = 1,.. ., P.) The univariate degrees Aj^p can be stored 
in a matrix A G which has N (yet unspecified) distinct rows. For convenience, we assume 

_1 /o 

that the first row contains only zeros, that is, pi = 1111111 ^ 1 ( 0 ) is the constant polynomial. 

Due to orthogonality, all but the first block in the right hand side vector (13) vanish. The 
stiffness matrix (12) can be written as 

p 

p^l 

where the symmetric matrices of size N x N involving the actual parameters are defined by 

Yjf = ih‘Pp‘Pl)Ll(0) 


for p = 1,..., P. 

For a fixed univariate degree n, the largest possible degree matrix A contains TV = (n + 1)^ 
rows, which is too much for most practical purposes. A widely used alternative is to limit the row 
sum of the degree matrix to equal n. This results in a so called total degree polynomial space, 
which is spanned by 


[p=i 


p=i 


'p<n 


We collect some combinatorial results related to the total degree space in the following theorem. 
Hereafter, a square matrix V whose diagonal entries are replaced by zeros, is denoted by ofFdiag(V). 


Theorem. Let A G be a degree matrix eorresponding to the polynomial basis of a total 

degree space of P variables and a total degree n, where 1 < n <C P. Then 


(a) 


N = 


P n 
n 


Pin! n! ' ' 
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(b) 


nnz(yl) = P 


P + n— 1 


Pn P" 

-N = 


n — I J P + n (n — 1)! 


+ C>(P"-^) 


(c) 


2 

nnz(offdiag(y^^^)) = — nnz(A) = - - 

P (n — 1)! 


and the positions of the nonzero elements o/ofFdiag(y^^^) are disjoint for I < p < P. 
Proof. We skip the proof here but refer to, e.g., [15, 19] for similar results with proofs. 


□ 


In what follows, we will use the result nnz(A) = (!1(P") from part (b) of the theorem. This is, 
of course, quite elementary consequence of part (a), since the definition of the total degree space 
immediately yields nnz(yl) < nN. Note that the quantities in parts (a) and (b) of the theorem 
are symmetric in terms of P and n (as is part (c) after summing over p). Thus, the asymptotic 
formulae for the case n —> oo can be obtained by switching the roles of P and n. 

Some alternatives for the total degree spaces can be found in [12]. We also mention that due to 
the locality of the splines, it seems reasonable to ignore some cross-terms including variables that 
correspond to coefficients of splines being supported far away from each other. This idea is not 
completely heuristic, see the asymptotic formula in elliptic case with small inclusions [1, Chap. 5]. 
We have also numerically observed this kind of behaviour. 

Ordinarily, one uses Legendre polynomials and a constant weight function, but other choices 
are possible as well. We refer to [18] for discussion on orthogonal polynomials and [9] for more 
detailed analysis of different types of spectral approximations. The convergence rate of spectral 
approximation depends on the smoothness of the forward solution with respect to the parameters. 
The affine representations (6) and (7) are known to result in an analytic parameter dependence 
in 17 X (0,T), see [25, 35]. On the other hand, when the number of parameters is large, it is 
only possible to use a low polynomial degree and thus the asymptotic convergence is not the main 
interest. Finally, we mention that the rest of this paper is equally applicable to the case where 
the parameter dependence is approximated with a spectral collocation method. In some cases, for 
example when the diffusivity parametrization is based on a boundary curve between two different 
diffusivity values, the collocation method may be significally more straightforward to formulate 
and implement. 


2.5 Time integration 

Let us continue by discussing how the semi-discrete equation (9) can be solved in time. In principle, 
any time integration method can be used to solve the resulting system of ordinary differential 
equations. However, diffusion equations typically require very small time steps if an explicit time 
integration method is used. This applies to parametric equations as well, see, e.g., [38] for some 
eigenvalue bounds. Implicit methods, on the other hand, require the solution of a full system of 
size MN having a nontrivial sparsity structure. 

Explicit methods with larger stability regions for parabolic equations can be derived from 
the class of so called Runge-Kutta-Chebyshev methods [45]. An alternative to explicit and im¬ 
plicit methods are semi-implicit or implicit-explicit (IMEX) methods [2, 49]. Here we resort to 
a first-order semi-implicit Euler method, which in [48] was shown to be unconditionally stable 
for parametric diffusion equations with symmetric Jacobi weights (in particular, with a constant 
weight and Legendre polynomials). We already know that the mass matrix B is block-diagonal and 
actually has constant blocks. The feasibility of the semi-implicit method is based on the diagonal 
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dominance of the matrices and the decomposition 

p 

A = fiI<»A' + ^(offdiag(y(^’)) 0 A^p'>) =:D + S, 

p=i 

where D G '[iMNxmn jg ^ block-diagonal matrix with constant blocks and /x > 0. This decom¬ 
position follows somewhat directly if the diffusivity is parametrized as in (7) with a = /x, see also 

[ 15 ] . 

Let us denote := u{k6), where k gN and 5 > 0 is a time step. Furthermore, let 

1 Ak+1)5 

^(fc+l/2)_M 

0 JkS 

be the mean value of the right hand side vector (13) over one time step. Starting from an initial 
vector which similar to f(t) contains in general only Af nonzero values, the semi-implicit 

Euler method presented in [48] can be written as 

(B + = (B- SS)u^’^^ + (16) 

Let us define 

■■■ Ui,N{t) 

mat{u{t)):= \ : G 

UM,l{t) ■■■ UM,N{t)_ 

as the matricization of the vector u(t) defined in (10). Correspondingly, we define the vectorization 
vec(mat('ii(t))) := u{t) G . Due to the block structure of the matrices B and D, the algorithm 

(16) can be efficiently implemented based on the following rules: 

1. Compute S ~ B* mat(M('=)). 

2. Compute ^ = vec(S) — + (5f('=+i/^). 

3. Solve r G from (B* -H SfiA')Y = mat(^). 

4. Set m('=+i) = vec(r). 

The solution to the system in the third step can be obtained by explicitly storing the inverse of the 
(time-independent) matrix B* -|- S^A* or storing the Cholesky factor and performing triangular 
substitutions. For large M, iterative methods such as the deflated conjugate gradient method may 
also be used [39]. 

The matrices B* and A* have 0{M) nonzero elements and their sparsity structure resulting 
from standard FEM discretization is well-studied and can be exploited. The only matrix of size 
MN X MN that has to be stored in the proposed algorithm is S, which is very sparse. Indeed, 
theorem 2.4 and the definition (14) yield 

nnz(S') = y nnz(offdiag(Y(P)))nnz(A(P)) = 2njV^nnz(A*) ^ 

P + n 

p=i 

If rj = 0{P), then nnz(S) = 0{MN). However, if P is large and the functions x/^p are splines 
or otherwise supported on a small region, then rj = 0(1) and the matrix S has essentially less 
nonzero elements than the vectors «('=) for k > 0. 

Naturally, the vectors ■u('=) can be discarded after computing the next solution ■u('=+^). Thus, 
arbitrarily small time steps can be used without imposing massive memory requirements, and 
the fact that the method is only first-order, is not necessarily an issue. As suggested in [48], 
the proposed method can be modified to perform a Jacobi iteration for parametric or stochastic 
elliptic equations. In addition, we note that preconditioned Krylov subspace methods for time- 
independent parametric or stochastic problems may employ structures that are similar to those 
presented here. See, e.g., [44] and the references therein for more information on that subject. 
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3 Estimating parameters from boundary data 

In this section, we consider the inverse problem of determining the diffusion coefficient from bound¬ 
ary measurements. As presented in section 2.1, the continuous formulation of the inverse problem 
is to find a diffusivity a G such that ju, which is the trace of the solution corresponding 

to a, equals (or is close to) the measurement C/: df2 x (0, T) —> R. 

A practical measurement contains only finitely many, say Q, values. That is, we consider 
a measurement vector {7 G satisfying C/g « where u is the temperature and 

C df2 X (0,r) defines the physical coordinates of the observations. We denote the 
measurement only as an approximation of the temperature due to unavoidable errors in measure¬ 
ments and uncertainties in the problem setting. The parametric numerical solution corresponding 
to the coordinates can be written as U: O ^ R*^, which satisfies 

UW = : . (17) 

Formally, the inverse problem can now be written as a parameter estimation problem 

argmin||i7(i?) - i7||2, (18) 

where 11-112 denotes the Euclidean norm and the diffusivity a can be computed from (6). Due to 
the ill-posedness, however, the minimization has to be regularized in order to avoid meaningless 
reconstructions. 

3.1 Regularized nonlinear least squares 

Let us briefly sketch a simple Gauss-Newton algorithm with line search for a minimization problem 
of the type (18). Assume 0 = R^ for a moment. Starting from G 0 and fc = 1, the algorithm 
produces a sequence of parameter vectors according to the following steps: 

1. Solve a linear least squares problem 

/iopt := argminll -f - U\\l. 

zieR^ 

2. Solve a one-dimensional optimization problem 

Oopt := argmin||[7(i?('""^) -k a/^opt) - U\\l- 

a6R+ 

3. Set -I- ttopt-^opt and increase k by one. 

These steps are repeated until a suitable stopping criterion is satisfied. Here, Ju ■ R'^ —> is 

the Jacobian matrix of the mapping U . The linear least squares problem in the first step is usually 
solved by QR decomposing the Jacobian [20, 37], although it is also possible to make the algorithm 
more efficient by employing the conjugate gradient method in case P is large [23, 30, 31]. (As our 
spectral Galerkin method does not allow very large P, we exclude such considerations.) Assuming 
that the Jacobian Ju and the vector U have already been evaluated at the computational 

complexity of the QR decomposition is 0{QP^). After that, the search direction .^opt can be 
obtained easily with triangular substitution. The line search of the second step requires only few 
evaluations of U, since an approximate solution for Oopt is usually enough. 
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We refer to [37] for more detailed discussion about nonlinear least squares algorithms. If 
0 7 ^ we also need to specify and implement constraints, which we ignored above. A common 
alternative to the standard Gauss-Newton algorithm (or the one with line search) is the Levenberg- 
Marquardt method, which essentially consists of the same subproblems as Gauss-Newton: evalu¬ 
ating U and its Jacobian Ju, and solving a linear least squares problem involving the Jacobian. 

If the problem (18) is replaced by a Tikhonov regularized version 


argmin {|ji7(i9) -tj\\l + A^|jG(i 9 )|| 2 } = argmin 


U{-d) - U] 

. \ 2 ’ 


(19) 


the same principles still apply, assuming that the regularization function G is easy to evaluate and 
differentiate, e.g., G is a matrix. A Bayesian interpretation for the inverse problem often results 
in a similar minimization problem as (19), if the maximum a posteriori estimate is sought. It is 
also possible to employ more advanced methods such as the iteratively regularized Gauss-Newton 
method and Lepskii balancing principle [7], while still having essentially the same subproblems. 


3.2 Evaluating polynomials and derivatives 

The nonlinear mapping U: 0 ^ in (17) can be decomposed as 

U{^) = 

where the matrix V G R*^^^ satisfies 


M 

i=l 


( 20 ) 


and the nonlinear part 0 —> R^ is defined in (15). The matrix V can be constructed in 
advance, that is, before performing any measurements or optimization. On the other hand, the 
multivariate polynomials can be evaluated according to 

ipji'd) = (^o)^”'^"' n 

peCj 


where 

{P I 

contains the indices to non-constant univariate polynomials. For a total degree space, the number 
of such indices must be \Cj\ < n for each j = 1,..., iV. In this case, and assuming n = 0(1), the 
evaluation of (p requires 0{N) floating point operations. The total complexity of evaluating U is 
thus determined by the matrix-vector multiplication ( 20 ), which takes 0{NQ). 

The Jacobian matrix Ju: 0 —> R*^^^ can be written as 


Ju{J) = (21) 

where the basis Jacobian J^'. 0 —)■ R^^^ contains the partial derivatives of the multivariate 
polynomials. Note that for a general argument J G 0, the basis Jacobian has exactly the same 
sparsity structure as the degree matrix A defining the underlying polynomial space. For that 
reason, the evaluation of J^p becomes quite cheap. Indeed, for any 1 < j < A^ and p G Cj, we have 

qGCj\p 

and for p ^ £j, the corresponding entries in the basis Jacobian Jp vanish. Again, if \Cj\ = 0(1) 
for each j, the evaluation of Jp requires 0{N) operations. Due to the sparsity the matrix-matrix 
product in (21) takes only 0{NQ). 

From theorem 2.4 we immediately get the following: 
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Corollary. The computational cost of evaluating the parametric solution U and all its first-order 
partial derivatives is 0{P^Q), if the polynomial space of total degree n is used to discretize the pa¬ 
rameter domain. In particular, choosing n = 2 yields the same complexity as the QR decomposition 
in a Gauss-Newton step. 

In other words, a quadratic approximation of the parameter dependence does not pose a bottle¬ 
neck for the efficiency of a Gauss-Newton or similar minimization scheme. The result is nontrivial 
in the sense that a naive finite difference approximation of the Jacobian Ju would require P func¬ 
tion evaluations and thus have a complexity of 0{P"^^Q). Let us also emphasize at this point 
that the complexity of the inverse problem is completely independent of spatial and temporal dis¬ 
cretizations of the forward problem and it is also irrelevant whether the spectral approximation 
was obtained with the Galerkin or collocation method. 

The linearized subproblems in a Gauss-Newton scheme can also be solved by, e.g., a conjugate 
gradient iteration instead of the QR decomposition. For general complexity analysis of such nested 
iterations, see [30]. Extending the analysis to include a polynomial surrogate model is beyond the 
scope of this article. 

Notice that the forward parametric solution can be solved with a larger spectral basis than what 
is used when evaluating the solution and its Jacobian. This is useful, since it is difficult to choose 
in advance the optimal subset of multivariate polynomials. After having the parametric solution 
at hand, one can simply discard those polynomials that correspond to columns of the matrix V 
having smallest (Euclidean) norm. It is also possible to use different subsets of polynomials for cp 
and for Having fewer polynomials for the Jacobian results in the so called perturbed Gauss- 
Newton method [21]. In particular, it is even possible to solve a single regular forward problem at 
each step while approximating the Jacobian with the pre-computed polynomials. 

Finally, note that it is also possible to treat the inverse problem that is based on several 
different intitial conditions, boundary fluxes or other components of the problem setting. This 
merely requires stacking the corresponding matrices V on top of each other. 


4 Numerical examples 

The proposed method is demonstrated with simulated boundary data. First, we solve a two- 
dimensional parametric forward problem corresponding to P = 14^ = 196 bi-quadratic B-splines 
that are uniformly spaced so that they form a partition of unity on the unit square f2 = (0,1)^. 
We choose the parameter interval E = (1/2,2), constant weight w and employ polynomial space 
of total degree n = 2, resulting to fV = 19503 multivariate Legendre polynomials in accordance 
with part (a) of theorem 2.4. We assume zero initial condition ug = 0 and also set / = 0. For the 
horizontal boundaries, we assume homogeneous Neumann conditions, i.e., g\x 2 &{o,i} = Oj whereas 
the vertical boundaries satisfy g\xi=o = —20t and g\xi=i = 20t. This corresponds to the case 
where two sides of a square-shaped object are insulated and two sides are heated or cooled with 
a heat flux which is linear in time. The spatial discretization is performed with M = 37^ = 1369 
uniformly spaced piecewise linear FEM basis functions (corresponding to 2592 triangular elements) 
and for the semi-implicit Euler method we choose the time step 5 = 10“^. 

The boundary data is generated by using M = 129^ = 16641 FEM basis functions and the sec¬ 
ond order Grank-Nicolson time integration method with a step length 5 = 10“^. The measurement 
consists of Qs = 36 spatial points that are uniformly distributed across the boundary (including 
corners), and of Qt = 13 time instances G {0.01,0.05, 0.09,..., 0.49}. Thus, the measurement 
vector U has Q = QsQt = 468 elements. For each value, we add independent zero mean Gaussian 
noise realizations with standard deviation a = ag ■ maxi<j<Q Uj with some ag > 0. 

The reconstructions are computed by minimizing (19) with the Isqnonlin function of Matlab. 
The default algorithm trust-region-reflective handles bound constraints that are chosen to 
agree with the parameter domain 0 = . The regularization function G is chosen to be a 
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discretized Laplace operator (i.e., a matrix which in one dimension would be tridiagonal with 
values —1, 2 and —1) so that the minimization prefers smooth diffusivities. The regularization 
parameter A is set such that the Morozov discrepancy principle 

\\U{i))-Uh^VQ<^ ( 22 ) 


holds for the minimizing vector Because the minimization process is very fast, the adjustment 
of A can be done, for example, by trial-and-error. 

In the first two-dimensional example we consider the boundary data corresponding to a smooth 
diffusivity 


a(x) = L25+=<t£ll£2!<i£?) 


(23) 


which satisfies 0.75 < d{x) < 1.75 for all x G ^2. Figure 1 shows the target diffusivity together 
with the reconstruction. Here, the noise parameter is cto = 0.001 and for the regularization we use 
A = 0.025. We see that the quality of the reconstruction is good. A piecewise constant diffusivity 
taking values 1/2 and 3/2 is reconstructed in figure 2 by using the aforementioned values for the 
noise and regularization. As expected, the reconstruction is far from exact in this nonsmooth case, 
because the regularization favors smooth diffusivities and such a piecewise constant diffusivity is 
also impossible to represent by using bi-quadratic B-splines. 

Let us also consider the approximation error 




(24) 


where is the minimizing vector as in (22) and denotes the simulated high-accuracy noiseless 
(ctq = 0 ) boundary data that is obtained by choosing the coefficient d to be the just reconstructed 
diffusivity. This error results from truncating the polynomial expansion (i.e., having a finite N) 
and also from the less accurate spatial and temporal discretizations of the parametric solution. In 
the examples shown in figures 1 and 2, the error values (24) are 0.11 and 0.10, respectively, which in 
these low-noise cases are slightly larger than ^/Qa in (22). However, although this approximation 
error is not accounted for in (22), no instability of the reconstructions was observed in any of the 
numerical tests. 

Some other smooth target diffusivities and their reconstructions are shown in figures 3 and 4. 
Also in these cases the target diffusivity values lie between 1/2 and 2. Now the noise parameter is 
considerably larger, namely cto = 0.02. In order to approximately satisfy the Morozov discrepancy 
principle (22), we set A = 0.4. The reconstructions are still qualitatively correct. The approxima¬ 
tion errors (24) related to figures 3 and 4 are 1.20 and 0.13, respectively, which are now smaller 
than the noise level in (22). The larger error in the third example may be explained by the target 
diffusivity values that are quite low; the parametric solution is usually most accurate when the 
diffusivity values are in the middle of the parameter interval E. 

In our three-dimensional example we have P = 6^ = 216 trilinear B-splines that form the 
partition of unity in the unit cube 17 = (0,1)^. As in the two-dimensional case, we use E = (1/2,2), 
constant weight and n = 2, which now results in = 23653. The Neumann boundary conditions 
on two opposing sides are g\xi=o = —40f and 5|a:i=i = 40t, whereas the remaining four faces have 
homogeneous Neumann boundary conditions. The initial value and the forcing term are set to zero 
as in the two-dimensional case. In three-dimensional case, we use a piecewise linear finite element 
mesh with M = 26^ = 17576 nodes and 93750 tetrahedra. The time step of the semi-implicit Euler 
method is still S = 10“^. 

The three-dimensional measurement data is generated by using M = 65^ = 274625 piece- 
wise linear basis functions, Crank-Nicolson time integration with a time step of <5 = 10“^ and a 
diffusivity 

d{x) = 1.25 -I- (0.5 — X 3 ) sin(6a;i) cos(4x2). (25) 

The measurement consists of Qs = 152 spatial locations that are uniformly spaced across the 
boundary of the cube, and of Qt = 13 time instances as before, so that Q = QsQt = 1976. The 
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target reconstruction 



Figure 1: Two-dimensional target diffusivity d (23) on the left and the reconstruction a on the 
right (cto = 0 .001). 


target reconstruction 



Figure 2: Two-dimensional piecewise constant target diffusivity d on the left and the reconstruction 
a on the right (cto = 0 .001). 


noise model is the same as in the two-dimensional case with (Tq = 0.01 and for the regularization 
we choose A = 0.09. Figure 5 shows that the reconstruction is qualitatively correct in this three- 
dimensional setting. 

All computations were performed by using Matlab R2014b. Computationally the most demand¬ 
ing case is the three-dimensional parametric forward problem in which MTV « 4 • 10® and quite a 
large amount of memory is needed even to store the intermediate vector during the time integra¬ 
tion. On the other hand, the measurement consists of only Q « M/10 values and thus the matrix 
V G can be handled easily even in the three-dimensional case. For a fixed regularization 

parameter, the reconstruction itself took at most a few seconds in all numerical experiments on an 
up-to-date desktop computer. This is the only step that cannot be performed offline, that is, prior 
to the measurements. Further improvements could still be expected by implementing a customized 
least squares solver and fine-tuning the stopping criterion for the iterative optimization. Moreover, 
combining our method with a state-of-the-art iteratively regularized Gauss-Newton method with 
inner conjugate gradient iteration is left for future studies. 


5 Conclusions 

We have studied a time-dependent parametric partial differential equation and the related in¬ 
verse boundary value problem. The parametric forward problem was solved by using the spectral 
Galerkin method in the parameter domain, finite element method in the spatial domain and a semi- 
implicit Euler method in the time interval. The inverse problem was interpreted as a nonlinear 
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Figure 3: Two-dimensional target diffusivity a on the left and the reconstruction a on the right 

K = 0.02). 



Figure 4: Two-dimensional target diffusivity a on the left and the reconstruction a on the right 
(cto = 0.02). 


least squares problem accompanied with a Tikhonov regularization. 

It was shown that having a quadratic approximation for the parameter dependence results 
in an efficient inverse algorithm, where the evaluation of the multivariate polynomials and their 
derivatives does not constitute an essential bottleneck for a Gauss-Newtonian least squares method, 
if QR decompositions are used for solving the linear subproblems. In particular, the inverse problem 
can be solved independently of the physical discretization of the forward problem. The numerical 
results indicated that the quadratic approximation is indeed accurate enough for qualitatively 
correct reconstructions. See [30, 31] for alternative approaches that are more efficient if the number 
of parameters is large and the QR decomposition becomes too expensive. 

The proposed method is versatile and can be applied to elliptic problems, including FIT, as 
well. The future research will concentrate on different regularization techniques such as sparsity 
promoting terms that do not satisfy the differentiability assumptions posed here. In addition, 
different parametrizations will be studied. 
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Figure 5: Left: slices of the three-dimensional target diffusivity (25) in the unit cube. Right: 
corresponding slices of the reconstruction (cto = 0 .01). 













































